Cross-Lingual Information to the Rescue in Keyword Extraction
Abstract
We introduce a method that extracts keywords in one language with the help of another. Our approach bridges and fuses word statistics across languages, which are conventionally treated as unrelated. The method involves estimating keyword preferences with respect to domain topics and generating cross-lingual bridges for integrating word statistics. At run-time, we transform parallel articles into word graphs, build cross-lingual edges, and apply PageRank with word keyness information to extract keywords. We present a system, BiKEA, that applies the method to keyword analysis. Experiments show that keyword extraction benefits from PageRank, globally learned keyword preferences, and cross-lingual interaction of word statistics that respects language diversity.
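The run-time steps named in the abstract (word graphs, cross-lingual edges, PageRank biased by keyness) can be illustrated with a minimal sketch. This is not the BiKEA implementation: the co-occurrence window, the `bridge` list of translation pairs, the `keyness` weights, the English/German toy sentences, and every function name below are assumptions made purely for illustration.

```python
from collections import defaultdict


def cooccurrence_edges(tokens, lang, window=2):
    """Symmetric co-occurrence edges between words within a sliding window."""
    edges = defaultdict(float)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != word:  # skip self-loops
                a, b = (lang, word), (lang, tokens[j])
                edges[(a, b)] += 1.0
                edges[(b, a)] += 1.0
    return edges


def pagerank(edges, preference, damping=0.85, iterations=50):
    """Weighted PageRank whose teleport distribution follows `preference`."""
    nodes = {node for edge in edges for node in edge}
    out_weight = defaultdict(float)
    for (src, _), weight in edges.items():
        out_weight[src] += weight
    # Nodes without an explicit preference get a neutral weight of 1.0.
    total = sum(preference.get(n, 1.0) for n in nodes)
    prior = {n: preference.get(n, 1.0) / total for n in nodes}
    score = dict(prior)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * prior[n] for n in nodes}
        for (src, dst), weight in edges.items():
            nxt[dst] += damping * score[src] * weight / out_weight[src]
        score = nxt
    return score


def rank_bilingual(tokens_a, tokens_b, bridge, keyness, bridge_weight=1.0):
    """Merge two monolingual word graphs, link translation pairs, then rank."""
    edges = cooccurrence_edges(tokens_a, "en")
    for edge, weight in cooccurrence_edges(tokens_b, "de").items():
        edges[edge] += weight
    nodes = {node for edge in edges for node in edge}
    for word_a, word_b in bridge:  # hypothetical translation pairs
        a, b = ("en", word_a), ("de", word_b)
        if a in nodes and b in nodes:
            edges[(a, b)] += bridge_weight
            edges[(b, a)] += bridge_weight
    return pagerank(edges, keyness)


def top_keywords(scores, lang, k=3):
    """Highest-scoring words of one language."""
    words = [(s, w) for (l, w), s in scores.items() if l == lang]
    return [w for s, w in sorted(words, reverse=True)[:k]]


# Toy parallel "articles" (assumed data, not from the paper).
en = "keyword extraction uses graph ranking for keyword selection".split()
de = "schlagwort extraktion nutzt graph ranking zur schlagwort auswahl".split()
bridge = [("keyword", "schlagwort"), ("extraction", "extraktion")]
keyness = {("en", "keyword"): 3.0, ("de", "schlagwort"): 3.0}

scores = rank_bilingual(en, de, bridge, keyness)
print(top_keywords(scores, "en"), top_keywords(scores, "de"))
```

In this sketch the bridge edges let score flow between translation pairs, so a word that is central in one language can raise the rank of its counterpart in the other, while the `keyness` prior stands in for the globally learned keyword preferences.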
Similar Resources
Cross-Lingual Question Answering Using Common Semantic Space
With the advent of Big Data concept, a lot of attention has been paid to structuring and giving semantic to this data. Knowledge bases like DBPedia play an important role to achieve this goal. Question answering systems are common approach to address expressivity and usability of information extraction from knowledge bases. Recent researches focused only on monolingual QA systems while cross-li...
JAVELIN III: Cross-Lingual Question Answering from Japanese and Chinese Documents
In this paper, we describe the JAVELIN Cross Language Question Answering system, which includes modules for question analysis, keyword translation, document retrieval, information extraction and answer generation. In the NTCIR6 CLQA2 evaluation, our system achieved 19% and 13% accuracy in the English-to-Chinese and English-to-Japanese subtasks, respectively. An overall analysis and a detailed m...
MIKE: An Interactive Microblogging Keyword Extractor using Contextual Semantic Smoothing
Social media, such as tweets on Twitter and Short Message Service (SMS) messages on cellular networks, are short-length textual documents (short texts or microblog posts) exchanged among users on the Web and/or their mobile devices. Automatic keyword extraction from short texts can be applied in online applications such as tag recommendation and contextual advertising. In this paper we present ...
Exploiting Knowledge Bases for Multilingual and Cross-lingual Semantic Annotation and Search
The amount of entities in large knowledge bases (KBs) has been increasing rapidly, making it possible to propose new ways of intelligent information access. In addition, there is an impending need for systems that can enable multilingual and cross-lingual information access. In this work, we firstly demonstrate X-LiSA, an infrastructure for multilingual and cross-lingual semantic annotation, wh...
A Knowledge Base Approach to Cross-Lingual Keyword Query Interpretation
The amount of entities in large knowledge bases available on the Web has been increasing rapidly, making it possible to propose new ways of intelligent information access. In addition, there is an impending need for technologies that can enable cross-lingual information access. As a simple and intuitive way of specifying information needs, keyword queries enjoy widespread usage, but suffer from...